Ethical issues

Big Data ethics involves adherence to the concepts of right and wrong behavior regarding
data, especially personal data. Big Data ethics focuses on the collectors and disseminators of
structured or unstructured data.

Big Data ethics is supported, at EU level, by extensive documentation, which seeks to find
concrete solutions to maximize the value of Big Data without sacrificing fundamental human rights.
The European Data Protection Supervisor (EDPS) supports the right to privacy and the right to the
protection of personal data, with respect for human dignity. According to these documents, the
conceptual conflict between privacy and Big Data, and between intimacy and innovation, must be
overcome. It is essential to identify ways of including the ethical dimension in the development of
innovations. (European Economic and Social Committee 2017)

According to the new EU Regulation 2016/679, data controllers must implement privacy-protecting
measures and technologies both when determining the means of processing and during the
processing itself. ENISA has identified many privacy-by-design strategies (data minimization,
hiding personal data and their interconnections, separate processing of personal data, choosing
the highest level of aggregation, transparency, monitoring, privacy policy, legal issues).

A basic way for peaceful coexistence between Big Data exploitation and data protection is user 
control of personal data, which leads to transparency and trust between users and digital service 
providers. As outlined in the GDPR impact assessment,

"Building trust in the online environment is key to economic development. Lack of trust makes
consumers hesitate to buy online and adopt new services, including public e-government
services. If not addressed, this lack of confidence will continue to slow down the development
of innovative uses of new technologies, to act as an obstacle to economic growth and to block
the public sector from reaping the potential benefits of digitization of its services." (European
Data Protection Supervisor, Opinion 7/2015, Meeting the challenges of Big Data: A call for
transparency, user control, data protection by design and accountability)

In the case of Big Data, traditional consent models are insufficient and outdated. Here, "consent
should be granular enough to cover all the different processing and purposes of processing and reuse
of personal data." (European Economic and Social Committee 2017)

A special problem is data portability, supported at EU level by the EDPS in Opinion 7/2015,
(MORO 2016) under which citizens must be guaranteed the right to access and correct their personal
data through expanded control. Data portability can help increase consumer awareness and control
by allowing data to be transferred between online services.

The EDPS considers that personal data should be treated just like other important resources,
such as oil, where trading takes place between equally well-informed parties (informational
symmetry). In fact, the market for personal information is characterized by informational asymmetry:
it is neither transparent nor fair, and customers are not compensated for the personal information
they provide. Thus, data portability would encourage a more competitive environment among
the beneficiaries of this data, with users able to choose to whom they offer their personal data.

Another approach involves the storage of personal data, with the possibility for the user to grant
or withdraw consent for the use of their personal data. (MORO 2016) (DG Connect 2015) The storage
of personal data involves a "concept, framework, and architectural implementation that shifts data
acquisition and control from a distributed data model to a user-centric model." (European Economic
and Social Committee 2017) Data portability could ensure this.

The EDPS supports promoting responsible beneficiaries and reducing bureaucracy in data
protection, through codes of conduct, audits, certifications, and a new generation of contractual
clauses and binding corporate rules. The responsibility of Big Data beneficiaries involves the
establishment of internal policies and control systems in accordance with the legislation in force,
through intelligent and dynamic solutions that guarantee respect for fundamental principles (data
minimization, purpose limitation, data quality, correct and transparent data processing, design, storage
limitation, integrity and confidentiality).

Data ethics is based on the following principles: ownership (individuals own their data),
transparency of transactions (users must have transparent access to the algorithm design), consent (the
user must be informed and expressly consent to the use of personal data), privacy (user privacy must
be protected), financial (the user should know the financial transactions resulting from the use of
their personal data), and openness (aggregated data sets should be freely available).

Ethics in research 

The term critical data studies (CDS) implies that researchers are investigating Big Data from 
critical perspectives. The study of data in this context involves, in addition to their analysis, the 
incorporation of data into practices (knowledge), political and economic institutions and systems, 
through the complex interaction between data and the entities that produce, own and use them.

An OECD report (2013) underlines that, unlike the ethical norms applied to common research
data, in the case of Big Data: (OECD 2013)

• Data collection was not subject to a formal ethical review process. 

• Common ethical rules will not be implemented in the case of Big Data.

• The use of research data may differ from the initial purpose. 

• Data is no longer held as discrete sets. 

The relationship between those who provide the data and those who use it is often indirect 
and variable. A more recent OECD report (2016) argues that this relationship is weaker or
non-existent, with Big Data limiting common capabilities. (OECD 2016)

Data storage is important for research integrity. The data must have a clear provenance, with 
known, identified and documented sources and processing. 

Much of the data that is not specifically collected for research is held to different standards
than research data.

For some data, often of commercial value (e.g., data collected on Twitter), there are legal 
restrictions on their reproduction. (UK Data Service 2017) 

Data storage must comply with standards of transparency and reproducibility. 

Awareness 

Awareness of the type of data that is provided during an online registration (for creating an
account, or a subscription, for example) is rare, especially since there is the possibility of using
an existing digital identity (Facebook profile, for example) instead of a separate registration for faster
access. Such situations create opacity regarding the data shared between the identity provider and
the service used.

Consent

In order to use the personal data of a person, his or her informed and explicit consent is
required regarding who will use the data, when, how and for what purpose. When data needs to be
shared, these uses must be made known to the person. It should always be possible to withdraw
consent for future use.
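
The requirements above (who, when, how, for what purpose, and withdrawal for future use) can be sketched as a small consent ledger. This is only an illustrative model, not a mechanism described in the sources cited here: the class names `ConsentGrant` and `ConsentLedger`, their fields, and the example recipient and purpose strings are all assumptions made for the sketch.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ConsentGrant:
    """One consent decision (hypothetical structure): who, for what purpose, how, since when."""
    recipient: str                      # who may use the data
    purpose: str                        # for what purpose
    method: str                         # how the data will be processed
    granted_at: datetime                # when consent starts to apply
    withdrawn_at: Optional[datetime] = None


@dataclass
class ConsentLedger:
    """All consent decisions of one data subject (hypothetical structure)."""
    subject_id: str
    grants: List[ConsentGrant] = field(default_factory=list)

    def grant(self, recipient, purpose, method, when=None):
        when = when or datetime.now(timezone.utc)
        entry = ConsentGrant(recipient, purpose, method, when)
        self.grants.append(entry)
        return entry

    def withdraw(self, recipient, purpose, when=None):
        """Withdraw consent for future use; uses before `when` stay covered."""
        when = when or datetime.now(timezone.utc)
        for entry in self.grants:
            if (entry.recipient == recipient and entry.purpose == purpose
                    and entry.withdrawn_at is None):
                entry.withdrawn_at = when

    def is_permitted(self, recipient, purpose, at=None):
        """True only if this exact recipient and purpose is covered at time `at`."""
        at = at or datetime.now(timezone.utc)
        return any(
            entry.recipient == recipient
            and entry.purpose == purpose
            and entry.granted_at <= at
            and (entry.withdrawn_at is None or at < entry.withdrawn_at)
            for entry in self.grants
        )
```

Because each grant is tied to one recipient and one purpose, a new use of the same data is not covered until the person agrees to it separately, and a withdrawal only affects uses after the withdrawal time.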

In Big Data analytics, very little can be known about the intended future uses of data, or
about the benefits and risks involved. For such cases there are procedures for "broad" and "generic"
consent, for example to share genomic data for different purposes. Even when done correctly, there
are some specific practical challenges: obtaining informed consent can be impossible or very costly,
and the validity of consent is disputed when the agreement is required to access a service.

Control 

In today's world, personal data can be traded just like any currency in Big Data implementation.
Opinions differ on the extent to which this situation is ethical, including on who should share in the
profit obtained from these transactions.

In the trading model of personal data, the transmission of personal data takes place within a
framework that offers people the opportunity to control their digital identity and create granular
data-sharing agreements.

The idea of open data, centered around the argument that data should be freely available, is 
now emerging. Willingness to share data varies by person. 

In the case of children, parents or guardians have responsibility for their data, which cannot be
traded for financial benefits.

At national level, a government is sovereign over the generated and collected data. On October
26, 2001, the Patriot Act entered into force in the US, and on May 25, 2018, the General Data
Protection Regulation 2016/679 (GDPR) became applicable at the European Union level, for the
issues related to the protection of personal data.

In Big Data, the human-data relationship is asymmetrical, based on data control. The "right 
to be forgotten", adopted at EU level, is one of the basic elements of an individual's control over his 
personal data. 

Transparency 

Anticipatory governance involves Big Data-based predictive analytics to evaluate potential 
behaviors, with ethical implications that can encourage prejudice and discrimination. 

A person who accepts the inclusion of his personal data in Big Data has the right to know why 
the data is collected, how it will be used, how long it will be stored, and how it can be modified. 

Trust 

Trust in Big Data systems is interdependent with confidentiality and awareness. So far, trust
has been considered from a strictly technological perspective. It is hoped that hardware and
software architectures will be developed that could increase trust between human beings
and objects, and thus lead to a greater acceptance of the use of personal data.

Ownership 

A fundamental question in the ethics of Big Data research is, who owns the data? This involves 
the subject of property rights and obligations. In European law, the GDPR indicates that people
own their personal data.

The sum of an individual's personal data forms a digital identity. 

The protection of the moral rights (the right to be identified as a source of data, and to control
them) of an individual is based on the opinion that personal data are a direct expression of the
individual's personality, and can only be transferred to another person, if at all, by succession
when the individual dies.

Property implies exclusivity, i.e. the implicit restriction of others regarding access to the
property. Efficient ownership of personal data involves portability, the ability to use alternatives
without losing data. Standardization would also help individuals clean up their personal data.

At present, the data is owned by whoever makes the recording or by the entity that owns the
sensor.

In the EU, the possibility of EU citizens' data being stored outside the so-called "Euro cloud"
has been progressively reduced, but the problem of data already stored and processed elsewhere has
not been resolved, and this "does not resolve the ethical dilemma of how data ownership is defined
philosophically, before passing to a more down-to-earth approach of law and policy making."
(European Economic and Social Committee 2017)

Surveillance and security 

More and more data sources are available with the help of advanced technologies such as
CCTV, GPS, mobile devices, credit cards, ATMs. Active surveillance is also a method of collecting
data, but at the same time it limits the freedoms of citizens. Such permanent surveillance increases
people's stress and creates a tendency to behave in ways that conform to the expected norms.

Digital identity 

Digital identity has the advantage of quick access to online content and related services. The
use of digital identity has the potential to generate discrimination based on the representation of a
person according to their online data, which may often not correspond to the real situation, in a
process called "data dictatorship" in which "we are no longer judged on the basis of our actions, but
on the basis of what all the data about us indicates our probable actions may be", (Norwegian Data
Protection Authority 2013) with personal interaction now being relegated to a secondary plane.

Tailored reality 

Any interaction we have with the Internet implies the possibility of storing our personal data.
The processing and analysis of this data shapes the personalized results that appear later on the
Internet, through our search results, the display of products in online stores, the display of
advertisements, etc. This generates a narrower and more personalized version of a user's previous
online experience (the so-called "filter bubble"). (Pariser 2011) An advantage is that the user will
quickly find what he or she usually looks for, but excluding certain aspects, perspectives and ideas
can restrict creativity and the development of a tolerant attitude, through political and social
isolation from other perspectives and a lack of pluralistic views. (Crawford, Gray, and Miltner 2014)

De-identification 

De-identification involves deleting or hiding elements that could immediately identify a person 
or organization. Legislation in different countries on data protection defines different treatments for 
identifiable data. Identifiability is increasingly seen as a continuum, not a binary aspect. Disclosure 
risks increase simultaneously with the number of variables, data sources and the power of data analysis. 
Disclosure risks may be mitigated but not eliminated. De-identification remains a vital tool for 
ensuring the safe use of data. (UK Data Service 2017) 
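
The way disclosure risk grows with the number of variables can be illustrated with a small sketch. It uses k-anonymity (the size of the smallest group of records that share the same values for a chosen set of attributes), a measure that is not named in this text and is introduced here only as an assumption for illustration. The function name `k_anonymity`, the field names and the records are all hypothetical.

```python
from collections import Counter


def k_anonymity(records, quasi_identifiers):
    """Return the size of the smallest group of records sharing the same
    values for the given attributes (0 for an empty data set).
    A result of 1 means at least one record is unique on those attributes."""
    groups = Counter(
        tuple(record[name] for name in quasi_identifiers) for record in records
    )
    return min(groups.values()) if groups else 0


# Hypothetical records with direct identifiers (name, address) already removed.
records = [
    {"zip": "1000", "birth_year": 1980, "sex": "F"},
    {"zip": "1000", "birth_year": 1980, "sex": "M"},
    {"zip": "1000", "birth_year": 1975, "sex": "F"},
    {"zip": "2000", "birth_year": 1980, "sex": "F"},
    {"zip": "2000", "birth_year": 1980, "sex": "F"},
    {"zip": "1000", "birth_year": 1975, "sex": "F"},
]

k_one_variable = k_anonymity(records, ["zip"])
k_two_variables = k_anonymity(records, ["zip", "birth_year"])
k_three_variables = k_anonymity(records, ["zip", "birth_year", "sex"])
```

In this toy data set every record shares its postal code, and its postal code plus birth year, with at least one other record, but once the third variable is added some records become unique. This matches the view of identifiability as a continuum: no single field identifies anyone, yet the combination can.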

Information that is perfectly anonymous when taken separately can be combined with other data to
uniquely identify a person with varying degrees of certainty. Profiling can become a powerful tool,
raising concerns about the degree to which intrusion into an individual's life is allowed, the possibility
of ensuring security, and surveillance.

Digital inequality

The advantages of the scale of Big Data are clear, but there are also opinions that the accumulation
of data on a huge scale presents specific risks. Because of this scale, few entities have access,
through infrastructure and skills, to Big Data systems. In this context, the costs and skills needed for
access lead to specific digital inequalities that ethics must address.

Privacy 

In data transactions it is very important to ensure confidentiality:

"No one shall be subjected to arbitrary interference with his privacy, family, home or correspondence,
nor to attacks upon his honour and reputation. Everyone has the right to the protection of
the law against such interference or attacks." - United Nations Universal Declaration of Human
Rights, Article 12.

In many countries, government monitoring of data to observe citizens requires explicit
authorization through an appropriate judicial process. Privacy is not about keeping
secrets, but about choice, human rights, and freedom.

Often privacy is wrongly viewed as a binary choice between isolation and scientific progress. 
Identity protection in data is technologically possible, for example using homomorphic encryption 
and algorithmic design. 
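
As a rough illustration of what homomorphic encryption makes possible, the sketch below implements a toy version of the Paillier cryptosystem, an additively homomorphic scheme chosen here as an assumption (the text does not name a specific scheme). The primes are far too small for real use, and the function names are hypothetical. It shows that two encrypted values can be added by a party that never sees either value.

```python
import math
import secrets


def keygen(p, q):
    """Toy Paillier key generation from two primes (assumes gcd(lam, n) == 1).
    Returns the public modulus n and the private pair (lam, mu)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; valid because the generator is n + 1
    return n, (lam, mu)


def encrypt(n, message):
    """Encrypt an integer 0 <= message < n under the public modulus n."""
    n_squared = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # random value in 1..n-1
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, message, n_squared) * pow(r, n, n_squared)) % n_squared


def decrypt(n, private_key, ciphertext):
    """Recover the plaintext using the private pair (lam, mu)."""
    lam, mu = private_key
    x = pow(ciphertext, lam, n * n)
    return ((x - 1) // n * mu) % n


def add_encrypted(n, c1, c2):
    """Multiplying ciphertexts yields an encryption of the sum of the plaintexts."""
    return (c1 * c2) % (n * n)


# Toy parameters only: real deployments use primes of 1024 bits or more.
n, private_key = keygen(61, 53)
encrypted_total = add_encrypted(n, encrypt(n, 20), encrypt(n, 22))
total = decrypt(n, private_key, encrypted_total)
```

Only the holder of the private key can read `total`; the party that calls `add_encrypted` works on ciphertexts alone, which is the sense in which value can be extracted from data without exposing the individual entries.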

Privacy as a limitation of the use of data can also be considered unethical, (Kostkova et al. 
2016) especially in healthcare, but it should be kept in mind that it is possible to extract the value of 
the data without compromising privacy. 

Privacy is recognized as a human right by numerous national and international regulations.
Privacy in research is achieved through a combination of approaches: limiting the collected data,
anonymizing them, and regulating access to data. In the case of Big Data research, specific problems
arise: the ambiguity between the terms "privacy" and "confidentiality"; the declaration of social spaces
as public or private; users' ignorance of privacy risks; the blurred distinction between
public and private users. Currently there are disputes over whether data science should be classified
as human-subjects research, and therefore over whether it is subject to the usual rules of privacy.

Big Data research 

Through the new concepts of "algorithmic damage", "predictive analysis", etc., the algorithms 
currently used in Big Data operations go beyond the traditional view of privacy. According to the US 
National Science and Technology Council,

"'Analytical algorithms' as algorithms for prioritizing, classifying, filtering, and predicting. Their use
can create privacy issues when the information used by algorithms is inappropriate or
inaccurate, when incorrect decisions occur, when there is no reasonable means of redress,
when an individual’s autonomy is directly related to algorithmic scoring, or when the use of
predictive algorithms chills desirable behavior or encourages other privacy harms." (NSTC
(National Science and Technology Council) 2016, 18)

Big Data research is what the ethicist James Moor would call a "conceptual muddle" due to
the "inability to properly conceptualize the ethical values and dilemmas at play in a new technological
context." (Buchanan and Zimmer 2018) In this situation privacy is ensured through a combination of
different tactics and practices (controlled or anonymous environments, limitation of personal
information, anonymization of data, access restrictions, data security, etc.). In general, all related
concepts become confusing in the case of Big Data. Thus, posts on social networks are considered
public when the corresponding setting is enabled. But social networks are complex environments of
socio-technical interactions where users do not always understand the functionality of the settings
and terms of use. Thus, there is uncertainty about users' intentions and expectations, and these
conceptual deficiencies in the context of Big Data research lead to uncertainties regarding the need
for informed consent.